<div class="tmd-doc">
<p></p>
<h1 class="tmd-header-1">
How to take addresses of the bytes in a string?
</h1>
<p></p>
<div class="tmd-usual">
We all know that string bytes are not addressable, so Go compilers report errors for expressions like <code class="tmd-code-span">&amp;aString[i]</code>.
</div>
<p></p>
<h2 class="tmd-header-2">
Use <code class="tmd-code-span">unsafe.StringData</code>
</h2>
<p></p>
<div class="tmd-usual">
Prior to Go 1.22 (and since Go 1.20), the only way to take the addresses of string bytes was the unsafe way:
</div>
<p></p>
<pre class="tmd-code">
<code class="language-Go">package main

import (
	"unsafe"
)

var h = "Hello "
var w = "world!"

func commonUnderlyingBytes(x, y string) bool {
	return &amp;[]byte(x)[4] == &amp;[]byte(y)[4]
}

func main() {
	var s = h + w
	//var p = &amp;s[4] // error: cannot take address of s[4]
	var p = unsafe.StringData(s[4:])
	println(p, string(*p)) // 0xc00004470c o
}
</code></pre>
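<p></p>
<div class="tmd-usual">
The same idea can be wrapped in a small helper that takes the address of an arbitrary byte. This is only a sketch: <code class="tmd-code-span">byteAddr</code> is a hypothetical name, not a standard library function, and the returned pointer must only be read through, never written through.
</div>
<p></p>

```go
package main

import "unsafe"

// byteAddr returns the address of s[i]. It is a hypothetical helper,
// not part of any library. The result must be treated as read-only:
// string bytes are immutable, so writing through it is not allowed.
func byteAddr(s string, i int) *byte {
	_ = s[i] // bounds check: panics if i is out of range
	return (*byte)(unsafe.Add(unsafe.Pointer(unsafe.StringData(s)), i))
}

func main() {
	s := "Hello world!"
	p := byteAddr(s, 4)
	println(string(*p)) // o

	// A substring shares the bytes of its parent string,
	// so both expressions yield the same address.
	println(p == byteAddr(s[4:], 0)) // true
}
```

<div class="tmd-usual">
The explicit <code class="tmd-code-span">s[i]</code> access keeps the usual bounds check, which <code class="tmd-code-span">unsafe.Add</code> alone does not perform.
</div>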
<p></p>
<h2 class="tmd-header-2">
Use <code class="tmd-code-span">reflect.Value.UnsafePointer</code>
</h2>
<p></p>
<div class="tmd-usual">
Since Go 1.23, the <code class="tmd-code-span">reflect.Value.UnsafePointer</code> method supports string values, so we can use it to get the addresses of string bytes.
</div>
<p></p>
<pre class="tmd-code">
<code class="language-Go">package main

import (
	"reflect"
)

var h = "Hello "
var w = "world!"

func commonUnderlyingBytes(x, y string) bool {
	return &amp;[]byte(x)[4] == &amp;[]byte(y)[4]
}

func main() {
	var s = h + w
	//var p = &amp;s[4] // error: cannot take address of s[4]
	var p = (*byte)(reflect.ValueOf(s[4:]).UnsafePointer())
	println(p, string(*p)) // 0xc000104eec o
}
</code></pre>
<p></p>
<div class="tmd-usual">
Both of the above ways rely on the <code class="tmd-code-span">unsafe</code> package or on <code class="tmd-code-span">unsafe.Pointer</code> values. The following one doesn't.
</div>
<p></p>
<h2 class="tmd-header-2">
Convert string to a byte slice, then take addresses of the byte slice
</h2>
<p></p>
<div class="tmd-usual">
Since Go toolchain 1.22, <a href="https://github.com/golang/go/issues/2205">an optimization</a> has been made so that memory allocation and byte duplication are no longer needed in some <code class="tmd-code-span">[]byte(aString)</code> conversions (those whose resulting bytes are proven never to be modified).
</div>
<p></p>
<p></p>
<div class="tmd-usual">
We can make use of this optimization to take the addresses of the bytes of a string: convert the string to a byte slice, then take the addresses of the bytes in that byte slice. Here is an example demonstrating the usage:
</div>
<p></p>
<pre class="tmd-code">
<code class="language-Go">package main

var h = "Hello "
var w = "world!"

func takeAddressOfFirstByte(s string) *byte {
	return &amp;[]byte(s)[0]
}

func commonUnderlyingBytes(x, y string) bool {
	return len(x) != 0 &amp;&amp; len(y) != 0 &amp;&amp;
		takeAddressOfFirstByte(x) == takeAddressOfFirstByte(y)
}

func main() {
	var s = h + w

	var p1 = &amp;[]byte(s)[4]
	println(p1, string(*p1))
	
	var p2 = &amp;[]byte(s)[4]
	println(p2, string(*p2))

	println(p1 == p2)
	
	var s2 = s
	println(commonUnderlyingBytes(s[4:], s2[4:]))
}
</code></pre>
<p></p>
<div class="tmd-usual">
Run it with Go toolchain versions 1.21 and 1.22: <sup><a id="fn:gotv:ref-1" href="#fn:gotv">[1]</a></sup>
</div>
<p></p>
<pre class="tmd-code">
$ gotv 1.21. run demo1.go
[Run]: $HOME/.cache/gotv/tag_go1.21.12/bin/go run demo1.go
0xc00003c6c4 o
0xc00003c6a4 o
false
false
$ gotv 1.22. run demo1.go
[Run]: $HOME/.cache/gotv/tag_go1.22.5/bin/go run demo1.go
0xc0000426ec o
0xc0000426ec o
true
true
</pre>
<p></p>
<div class="tmd-usual">
Wait! Does this mean we can modify string content now? Don't worry, the answer is certainly no. The compiler detects such attempts and disables the optimization for the involved conversions, as the following code demonstrates:
</div>
<p></p>
<pre class="tmd-code">
<code class="language-Go">package main

var h = "Hello "
var w = "world!"

func main() {
	var s = h + w

	var p1 = &amp;[]byte(s)[4]
	println(p1, string(*p1))
	
	var p2 = &amp;[]byte(s)[4]
	*p2 = 'X' // modify the byte
	println(p2, string(*p2))

	println(p1 == p2)
}
</code></pre>
<p></p>
<div class="tmd-usual">
The behavior is identical between Go toolchain versions 1.21 and 1.22. <sup><a id="fn:gotv:ref-2" href="#fn:gotv">[1]</a></sup>
</div>
<p></p>
<pre class="tmd-code">
$ gotv 1.21. run demo2.go
[Run]: $HOME/.cache/gotv/tag_go1.21.12/bin/go run demo2.go
0xc00003c6d4 o
0xc00003c6b4 X
false
$ gotv 1.22. run demo2.go
[Run]: $HOME/.cache/gotv/tag_go1.22.5/bin/go run demo2.go
0xc0000426f0 o
0xc0000426cc X
false
</pre>
<p></p>
<div class="tmd-usual">
So for this approach to work, the bytes in the resulting byte slice must never be modified, not even on a code path that might never run.
</div>
<p></p>
<div class="tmd-usual">
Is this trick useful in practice? I think the answer is yes. If one of two strings is derived from the other (through a substring operation) and both start at the same byte, then the following <code class="tmd-code-span">CommonStringPrefix1</code> function finds the length of their common prefix faster than the <code class="tmd-code-span">CommonStringPrefix2</code> function (assuming we don't know in advance whether there is a derivation relation between the two strings).
</div>
<p></p>
<pre class="tmd-code">
<code class="language-Go">func CommonStringPrefix1(x, y string) int {
	if len(x) == 0 || len(y) == 0 {
		return 0
	}
	// Fast path: if both strings start at the same address,
	// the shorter one is a prefix of the longer one.
	if &amp;[]byte(x)[0] == &amp;[]byte(y)[0] {
		if len(x) &lt;= len(y) {
			return len(x)
		}
		return len(y)
	}
	return CommonStringPrefix2(x, y)
}

func CommonStringPrefix2(x, y string) int {
	var min = len(x)
	if len(y) &lt; len(x) {
		min = len(y)
	}
	x2, y2 := x[:min], y[:min]
	if len(x2) == len(y2) {
		for i := 0; i &lt; len(x2); i++ {
			if x2[i] != y2[i] {
				return i
			}
		}
	}
	return min
}
</code></pre>
<p></p>
<hr  class="tmd-seperator"/>
<p></p>

<ol class="tmd-list tmd-footnotes" id="fn:">
<li id="fn:gotv" class="tmd-list-item tmd-footnote-item">
<div id="gotv" class="tmd-usual">
GoTV (<a href="https://go101.org/apps-and-libs/gotv.html">https://go101.org/apps-and-libs/gotv.html</a>) is a tool for installing and using multiple coexisting Go toolchain versions harmoniously.
</div>
<a href="#fn:gotv:ref-1">↩︎</a><a href="#fn:gotv:ref-2">↩︎</a></li>
</ol>
</div>
